A small assembler for the MIPS

This is part of the code generator for Standard ML of New Jersey. We generate code in several stages. This is nearly the lowest stage; it is like an assembler. The user can call any function in the [[MIPSCODER]] signature. Each one corresponds to an assembler pseudo-instruction. Most correspond to single MIPS instructions. The assembler remembers all the instructions that have been requested, and when [[codegen]] is called it generates MIPS code for them.

Some other structure will be able to use the MIPS structure to implement a [[CMACHINE]], which is the abstract machine that ML thinks it is running on. (What really happens is that a functor maps some structure implementing [[MIPSCODER]] to a different structure implementing [[CMACHINE]].)

Any function using a structure of this signature must avoid touching registers 1 and 31. Those registers are reserved for use by the assembler.
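The layering described above can be sketched in miniature. Everything named here is an assumption made for illustration: [[TINY_CODER]] and [[TINY_MACHINE]] are hypothetical, cut-down stand-ins for [[MIPSCODER]] and [[CMACHINE]], and [[TraceCoder]] is a toy that only logs requests instead of generating MIPS code.

```sml
(* Hypothetical, cut-down stand-in for MIPSCODER: three operations only. *)
signature TINY_CODER = sig
  datatype Register = Reg of int
  datatype EA = Direct of Register | Immed of int
  val add     : Register * EA * Register -> unit  (* (operand1, operand2, result) *)
  val move    : EA * Register -> unit             (* move(src, dest) *)
  val codegen : unit -> unit
end

(* Hypothetical client interface, standing in for CMACHINE. *)
signature TINY_MACHINE = sig
  val increment : int -> unit   (* add 1 to the numbered register *)
  val finish    : unit -> unit  (* ask the coder to generate code *)
end

(* A functor maps any structure matching the coder signature
   to a structure matching the machine signature. *)
functor TinyMachine (C : TINY_CODER) : TINY_MACHINE = struct
  fun increment r = C.add (C.Reg r, C.Immed 1, C.Reg r)
  val finish = C.codegen
end

(* A toy coder that only logs the requests it receives. *)
structure TraceCoder = struct
  datatype Register = Reg of int
  datatype EA = Direct of Register | Immed of int
  val log : string list ref = ref nil
  fun say s = log := s :: !log
  fun add _ = say "add"
  fun move _ = say "move"
  fun codegen () = say "codegen"
end

structure Machine = TinyMachine (TraceCoder)

val () = Machine.increment 2   (* registers 1 and 31 are left alone *)
val () = Machine.finish ()
val requests = rev (!TraceCoder.log)   (* ["add", "codegen"] *)
```

The client never touches registers 1 and 31, which matches the restriction stated above; the real signatures are of course much larger than this sketch.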

@ Here is the signature of the assembler, [[MIPSCODER]]. It can be extracted from this file by

[[notangle mipsinstr.nw -Rsignature]].

«signature»=
signature MIPSCODER = sig

(* Assembler for the MIPS chip *)

eqtype Label
datatype Register = Reg of int
        (* Registers 1 and 31 are reserved for use by this assembler *)
datatype EA = Direct of Register
            | Immed of int
            | Immedlab of Label
        (* effective address *)

structure M : sig

(* Emit various constants into the code *)

val emitstring : string -> unit         (* put a literal string into the
                                           code (null-terminated?) and
                                           extend with nulls to 4-byte
                                           boundary. Just chars, no
                                           descriptor or length *)
exception BadReal of string
val low_order_offset : int              (* does the low-order word of a
                                           floating point literal come
                                           first (0) or second (1) *)
val realconst : string -> unit          (* emit a floating pt literal *)
val emitlong : int -> unit              (* emit a 4-byte integer literal *)

(* Label bindings and emissions *)

val newlabel : unit -> Label            (* new, unbound label *)
val define : Label -> unit              (* cause the label to be bound to
                                           the code about to be generated *)
val emitlab : int * Label -> unit       (* L3: emitlab(k,L2) is equivalent to
                                           L3: emitlong(k+L2-L3) *)

(* Control flow instructions *)

val slt : Register * EA * Register -> unit
                (* (operand1, operand2, result) *)
                (* set less than family *)
val beq : bool * Register * Register * Label -> unit
                (* (beq or bne, operand1, operand2, branch address) *)
                (* branch equal/not equal family *)

val jump : Register -> unit (* jump register instruction *)

val slt_double : Register * Register -> unit    (* floating pt set less than *)
val seq_double : Register * Register -> unit    (* floating pt set equal *)
val bcop1 : bool * Label -> unit                (* floating pt conditional branch *)

(* Arithmetic instructions *)
(* arguments are (operand1, operand2, result) *)

val add : Register * EA * Register -> unit
val and' : Register * EA * Register -> unit
val or : Register * EA * Register -> unit
val xor : Register * EA * Register -> unit
val sub : Register * Register * Register -> unit
val div : Register * Register * Register -> unit
                (* first arg is some register guaranteed
                   to overflow when added to itself.
                   Used to detect divide by zero. *)
val mult : Register * Register * Register -> unit
val mfhi : Register -> unit             (* high word of 64-bit multiply *)

(* Floating point arithmetic *)

val neg_double : Register * Register -> unit
val mul_double : Register * Register * Register -> unit
val div_double : Register * Register * Register -> unit
val add_double : Register * Register * Register -> unit
val sub_double : Register * Register * Register -> unit

(* Move pseudo-instruction : move(src,dest) *)

val move : EA * Register -> unit

(* Load and store instructions *)
(* arguments are (destination, source address, offset) *)

val lbu : Register * EA * int -> unit   (* bytes *)
val sb : Register * EA * int -> unit
val lw : Register * EA * int -> unit    (* words *)
val sw : Register * EA * int -> unit
val lwc1: Register * EA * int -> unit   (* floating point coprocessor *)
val swc1: Register * EA * int -> unit
val lui : Register * int -> unit

(* Shift instructions *)
(* arguments are (shamt, operand, result) *)
(* shamt as Immedlab _ is senseless *)

val sll : EA * Register * Register -> unit
val sra : EA * Register * Register -> unit

(* Miscellany *)

val align : unit -> unit        (* cause next data to be emitted on a 4-byte boundary *)
val mark : unit -> unit         (* emit a back pointer, also called mark *)

val comment : string -> unit

end (* signature of structure M *)

val codegen : unit->unit

val codestats : outstream -> unit (* write statistics on stream *)

end (* signature MIPSCODER *)
@ The basic strategy of the implementation is to hold on, via the [[kept]] pointer, to the list of instructions generated so far. We use [[instr]] for the type of an instruction, so [[kept]] has type [[instr list ref]].

The list is kept in reverse order of execution: the instruction at the head of [[!kept]] is executed last. This lets us accept calls in the order of execution yet add each new instruction to the list in constant time.
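A minimal sketch of this reverse-order bookkeeping follows. It is an illustration under simplifying assumptions, not the implementation: instructions are plain strings, and [[record]] is a hypothetical helper introduced only for this example.

```sml
(* Hypothetical miniature: instructions are just strings here. *)
val kept : string list ref = ref nil

(* Record one instruction in constant time by consing onto the head. *)
fun record i = kept := i :: !kept

(* Requests arrive in execution order. *)
val () = record "lw"
val () = record "add"
val () = record "sw"

(* The head of !kept is the most recent request, i.e. the last to execute. *)
val newestFirst = !kept            (* ["sw", "add", "lw"] *)

(* Reversing the list once recovers execution order. *)
val executionOrder = rev (!kept)   (* ["lw", "add", "sw"] *)
```

Appending to the tail instead would cost time proportional to the length of the list on every call; consing and reversing once at the end keeps each request constant-time.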

@ We structure the instruction stream a little bit by factoring out the different load and store instructions that can occur: we have load byte, load word, and load to coprocessor (floating point).
«types auxiliary to [[instr]]»=
datatype size = Byte | Word | Floating
@ Here are the instructions that exist. We list them in more or less the order of the [[MIPSCODER]] signature.
«definition of [[instr]]»=
«types auxiliary to [[instr]]»

datatype instr =
    STRINGCONST of string               (* constants *)
  | EMITLONG of int

  | DEFINE of Label                     (* labels *)
  | EMITLAB of int * Label

  | SLT of Register * EA * Register     (* control flow *)
  | BEQ of bool * Register * Register * Label
  | JUMP of Register
  | SLT_D of Register * Register
  | SEQ_D of Register * Register
  | BCOP1 of bool * Label

  | NOP                                 (* no-op for delay slot *)

  | ADD of Register * EA * Register     (* arithmetic *)
  | AND of Register * EA * Register
  | OR of Register * EA * Register
  | XOR of Register * EA * Register
  | SUB of Register * Register * Register
  | MULT of Register * Register
  | DIV of Register * Register
  | MFLO of Register                    (* mflo instruction used with
                                           64-bit multiply and divide *)
  | MFHI of Register

  | NEG_D of Register * Register
  | MUL_D of Register * Register * Register
  | DIV_D of Register * Register * Register
  | ADD_D of Register * Register * Register
  | SUB_D of Register * Register * Register

  | MOVE of EA * Register               (* put something into a register *)
  | LDI_32 of int * Register            (* load in a big immediate constant (>16 bits) *)
  | LUI of Register * int               (* Mips lui instruction *)

  | LOAD of size * Register * EA * int  (* load and store *)
  | STORE of size * Register * EA * int

  | SLL of EA * Register * Register     (* shift *)
  | SRA of EA * Register * Register

  | COMMENT of string                   (* generates nothing *)
  | MARK                                (* a backpointer *)

  | BREAK of int                        (* break instruction *)
@ Here is the code that handles the generated stream, [[kept]]. It begins life as [[nil]] and returns to [[nil]] every time code is generated. The function [[keep]] is a convenient way of adding a single [[instr]] to the list; it's very terse. Sometimes we have to add multiple [[instr]]s; then we use [[keeplist]]. We also define a function [[delay]] that is just like [[keep]] but adds a [[NOP]] in the delay slot.
«instruction stream and its functions»=
val kept = ref nil : instr list ref
fun keep f a = kept := f a :: !kept
fun delay f a = kept := NOP :: f a :: !kept
fun keeplist l = kept := l @ !kept
«reinitialize [[kept]]»=
kept := nil
@
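To see how the three helpers behave, here is a small runnable sketch. It assumes a cut-down [[instr]] with only four constructors, and the one-line emitters [[mfhi]] and [[jump]] built by partial application are an assumed usage pattern for illustration, not necessarily how the real structure defines them.

```sml
(* Cut-down stand-ins for the document's types; not the full definitions. *)
datatype Register = Reg of int
datatype instr = NOP
               | JUMP of Register
               | MFHI of Register
               | COMMENT of string

(* The three helpers, as defined in the chunk above. *)
val kept = ref nil : instr list ref
fun keep f a = kept := f a :: !kept
fun delay f a = kept := NOP :: f a :: !kept
fun keeplist l = kept := l @ !kept

(* Partial application gives one-line emitters (an assumed usage pattern). *)
val mfhi = keep MFHI
val jump = delay JUMP

val () = mfhi (Reg 2)
val () = jump (Reg 3)

(* keeplist prepends its argument unchanged, so the list is written
   newest-first to match the ordering of kept. *)
val () = keeplist [COMMENT "second", COMMENT "first"]

(* Reversing gives execution order; the NOP follows the jump. *)
val executionOrder = rev (!kept)
(* [MFHI (Reg 2), JUMP (Reg 3), NOP, COMMENT "first", COMMENT "second"] *)
```

Because [[delay]] conses the [[NOP]] on after the instruction, the [[NOP]] sits nearer the head of the list and therefore executes immediately after the instruction it accompanies.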

